May 2004 ACM Transactions on Embedded Computing Systems (TECS), Volume 3 issue 2
Full text available: pdf(467.82 KB) Additional Information: full citation, abstract, references, index terms

Increasing nonrecurring engineering and mask costs are making it harder to turn to 
hardwired application specific integrated circuit (ASIC) solutions for high-performance 
applications. The volume required to amortize these high costs has been increasing, making 
it increasingly expensive to afford ASIC solutions for medium-volume products. This has led 
to designers seeking programmable solutions of varying sorts using these so-called 
programmable platforms. These programmable platforms span a lar ... 

Keywords: Loop pipelining, coarse-grain reconfigurable fabric, datapath synthesis, 
interconnection design, reconfigurable datapath 
Full text available: pdf(d 98.92 KB) Additional Information: full citation, abstract, references, index terms

Verification is one of the most complex and expensive tasks in the current Systems-on-Chip 
design process. Many existing approaches employ a bottom-up approach to pipeline 
validation, where the functionality of an existing pipelined processor is, in essence, reverse- 
engineered from its RT-level implementation. Our validation technique is complementary to 
these bottom-up approaches. Our approach leverages the system architect's knowledge 
about the behavior of the pipelined architecture, through a ... 

Keywords: Modeling of processor pipeline, architecture description language, pipeline 
validation, pipelined processor specification 
The VAX Architecture has been extended to include an integrated, register-based vector
processor. This extension allows both high-end and low-end implementations and can be
supported with only small changes by VAX/VMS and VAX/ULTRIX operating systems. The
extension is effectively exploited by the new vectorizing capabilities of VAX FORTRAN.
Features of the VAX Vector Architecture and the design decisions which make it a consistent
extension of the VAX Architecture are ...
August 1993 Proceedings of the 7th international conference on Supercomputing
Full text available: pdf(1.36 MB)
Shared-memory provides a uniform and attractive mechanism for communication. For
efficiency, it is often implemented with a layer of interpretive hardware on top of a 
message-passing communications network. This interpretive layer is responsible for data 
location, data movement, and cache coherence. It uses patterns of communication that 
benefit common programming styles, but which are only heuristics. This suggests that 
certain styles of communication may benefit from direct access to the ... 
J. T. Canning, R. Miner

February 1989 Proceedings of the seventeenth annual ACM conference on Computer 
science : Computing trends in the 1990's: Computing trends in the 
Additional Information: full citation, abstract, references, citings, index terms

Full text available: pdf(594.17 KB)

A parallel pipelined data flow coprocessor has been developed for the 68000 based 
Commodore Amiga workstation. The coprocessor, based on Nippon Electric Corporation's 
μPD7281 Image Pipelined Processor (ImPP), was designed as an algorithm processor
for numerically intensive applications such as image processing, image synthesis, and
numerical analysis. The coprocessor can accommodate up to seven of the 5-MIPS ImPPs
providing over 30 MIPS of processing power to dedicated ... 
Verification is one of the most complex and expensive tasks in the current Systems-on-Chip
(SOC) design process. Many existing approaches employ a bottom-up approach to pipeline
validation, where the functionality of an existing pipelined processor is, in essence,
reverse-engineered from its RT-level implementation. Our approach leverages the system
architect's knowledge about the behavior of the pipelined architecture, through Architecture 
Description Language (ADL) constructs, and thus allows ... 

Keywords: Architecture Description Language, Pipeline Verification 
This paper discusses some of the issues involved in implementing a shared-address space
programming model on large-scale, distributed-memory multiprocessors. While such a
programming model can be implemented on both shared-memory and message-passing 
architectures, we argue that the transparent, coherent caching of global data provided by 
many shared-memory architectures is of crucial importance. Because message-passing 
mechanisms are much more efficient than shared-memory loads and stores fo ...

11 Code compression: Reducing code size for heterogeneous-connectivity-based
DSPs through synthesis of instruction set extensions

Partha Biswas, Nikil Dutt 

October 2003 Proceedings of the international conference on Compilers, architectures 
and synthesis for embedded systems 

Full text available: pdf(176.82 KB) Additional Information: full citation, abstract, references, index terms

VLIW DSP architectures exhibit heterogeneous connections between functional units and 
register files for speeding up special tasks. Such architectural characteristics can be 
effectively exploited through the use of complex instruction set extensions (ISEs). Although 
VLIWs are increasingly being used for DSP applications to achieve very high performance, 
such architectures are known to suffer from increased code size. This paper addresses how 
to generate ISEs that can result in significant code s ... 

Keywords: dependence conflict graph, heterogeneous-connectivity-based DSP, instruction 
set architecture, instruction set extensions, restricted data dependence graph, static single 
assignment 
Increasing non-recurring engineering (NRE) and mask costs are making it harder to turn to
hardwired Application Specific Integrated Circuit (ASIC) solutions for high performance 
applications [12]. The volume required to amortize these high costs has been increasing, 
making it increasingly expensive to afford ASIC solutions for medium volume products. This 
has led to designers seeking programmable solutions of varying sorts using these so-called
programmable platforms. These programmable platform ... 

13 Embedded systems: applications, solutions and techniques (EMBS): A
hardware/software kernel for system on chip designs

Andrew Morton, Wayne M. Loucks 

March 2004 Proceedings of the 2004 ACM symposium on Applied computing 

Full text available: pdf(997.76 KB) Additional Information: full citation, abstract, references

As part of the SoC design process, the application is partitioned between implementation in 
hardware and implementation in software. While it is customarily the application that is 
subject to partitioning, it is also possible to partition the software kernel. In this paper, a 
uniprocessor real-time kernel that implements the Earliest Deadline First (EDF) scheduling 
policy is partitioned. It is partitioned by moving the EDF scheduler into a coprocessor. The 
coprocessor size and performance are an ... 

Keywords: SoC, hardware/software codesign, operating systems 
May 2002 Proceedings of the tenth international symposium on Hardware/software
Full text available: pdf(646.07 KB) Additional Information: full citation, abstract, references, index terms

Eclipse defines a heterogeneous multiprocessor architecture template for data-dependent 
stream processing. Intended as a scalable and flexible subsystem of forthcoming media- 
processing systems-on-a-chip, Eclipse combines application configuration flexibility with the 
efficiency of function-specific hardware, or coprocessors. To facilitate reuse, Eclipse 
separates coprocessor functionality from generic support that addresses multi-tasking, 
inter-task synchronization, and data transport. Fi ... 

16 A practical toolbox for system level communication synthesis
Denis Hommais, Frederic Petrot, Ivan Auge 

April 2001 Proceedings of the ninth international symposium on Hardware/software
codesign 

Full text available: pdf(505.13 KB) Additional Information: full citation, abstract, references, index terms

This paper presents a practical approach to communication synthesis for hardware/software 
system specified as tasks communicating through lossless blocking channels. It relies on a 
limited set of templates that characterize the way data are exchanged between tasks 
realized either in software or in hardware. The templates are highly portable because their 
software part is implemented using the POSIX thread functions, and their hardware part is 
a hand crafted synthesizable module with a System ... 

17 Architectural and organizational tradeoffs in the design of the MultiTitan CPU
N. P. Jouppi 

April 1989 ACM SIGARCH Computer Architecture News, Proceedings of the 16th
annual international symposium on Computer architecture, volume 17 issue 3
This paper describes the architectural and organizational tradeoffs made during the design
of the MultiTitan, and provides data supporting the decisions made. These decisions 
covered the entire space of processor design, from the instruction set and virtual memory 
architecture through the pipeline and organization of the machine. In particular, some of 
the tradeoffs involved the use of an on-chip instruction cache with off-chip TLB and floating- 
point unit, the use of direct-mapped instead o ... 

18 StaCS: a Static Control Superscalar architecture
Benoît Dupont de Dinechin

December 1992 ACM SIGMICRO Newsletter, Proceedings of the 25th annual
international symposium on Microarchitecture, volume 23 issue 1-2
In this paper we present a unified approach to vector and scalar computation, using a single
register file for both scalar operands and vector elements. The goal of this architecture is to 
yield improved scalar performance while broadening the range of vectorizable applications. 
For example, reduction operations and recurrences can be expressed in vector form in this 
architecture. This approach results in greater overall performance for most applications 
than does the approach of emphasizin ... 